Skip to content

Refactor PIINVOICE so env vars are fetched at start of billing pipeline.#179

Open
KelvinLinBU wants to merge 1 commit intoCCI-MOC:mainfrom
KelvinLinBU:refactor-env
Open

Refactor PIINVOICE so env vars are fetched at start of billing pipeline.#179
KelvinLinBU wants to merge 1 commit intoCCI-MOC:mainfrom
KelvinLinBU:refactor-env

Conversation

@KelvinLinBU
Copy link
Copy Markdown
Member

@KelvinLinBU KelvinLinBU commented Apr 22, 2025

Closes #168. Adhere to testing best practices as outlined in #159 (comment) and #168 (comment). Modify utils.py to accommodate changes.

@QuanMPhm QuanMPhm requested review from QuanMPhm and larsks and removed request for QuanMPhm April 22, 2025 14:56
@KelvinLinBU KelvinLinBU self-assigned this May 14, 2025
@KelvinLinBU KelvinLinBU force-pushed the refactor-env branch 3 times, most recently from 63d9226 to 6d1bdef Compare May 29, 2025 16:02
@KelvinLinBU
Copy link
Copy Markdown
Member Author

@QuanMPhm can be merged with main now

- NewPICreditProcessor
"""

chrome_binary_location: str
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this arg below the constants. In this codebase, the conventional structure is class constants, then variables, then functions

chrome_binary_location = os.environ.get(
"CHROME_BIN_PATH", "/usr/bin/chromium"
)
if not os.path.exists(chrome_binary_location):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to move this check to process_report.py. We only want to move the fetching of the env var. Error checking can stay in the invoice class for now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add CHROME_BIN_PATH in REQUIRED_ENV_VARS

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to perform validation in two different places. If we're using pydantic_settings to manage environment variables, then rather than using required_env_files here, just mark chrome_bin_path as required in process_report.settings.Settings. Currently it is optional:

    chrome_bin_path: str | None = None

To make it required:

    chrome_bin_path: str

This will cause pydantic to throw a validation error when instantiating Settings:

pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
chrome_bin_path
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing

Similarly, you can remove the check for KEYCLOAK_CLIENT_ID and KEYCLOAK_CLIENT_SECRET here by adding a validator to Settings:

from pydantic import model_validator
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    ...

    
    @model_validator(mode="after")
    def check_keycloak_auth(self):
        if not self.coldfront_api_filepath and not (
            self.keycloak_client_id and self.keycloak_client_secret
        ):
            raise ValueError(
                "You must either set coldfront_api_filepath or provide keycloak credentials in "
                "KEYCLOAK_CLIENT_ID and KEYCLOAK_CLIENT_SECRET"
            )

You probably want to catch the exception from pydantic to produce a more friendly error message that does not include a full Python traceback.

def main():
"""Remove non-billable PIs and projects"""

chrome_binary_location = os.environ.get("CHROME_BIN_PATH", "/usr/bin/chromium")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the adding CHROME_BIN_PATH to REQUIRED_ENV_VARS, you should have the env var fetched after the env vars have been validated

@QuanMPhm QuanMPhm force-pushed the refactor-env branch 2 times, most recently from 4f89e7d to 0c340e2 Compare July 22, 2025 19:50
@QuanMPhm QuanMPhm requested review from knikolla and naved001 July 24, 2025 15:01
@QuanMPhm QuanMPhm dismissed their stale review August 1, 2025 14:13

I am now a contributor to this PR as well

subprocess.run(
[
CHROME_BIN_PATH,
os.environ.get("CHROME_BIN_PATH", "/usr/bin/chromium"),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the motivation for this change? I think it's generally better practice to read your environment variables early, rather than doing it inline like this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written so that the env var is loaded by the settings module

@QuanMPhm
Copy link
Copy Markdown
Contributor

QuanMPhm commented Mar 15, 2026

@larsks @knikolla After half a year, I want to wrap up this PR

…ipeline.

This makes use of `required_env_vars()`.
name: Run unit tests
runs-on: ubuntu-latest
env:
CHROME_BIN_PATH: /usr/foo/chromium
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we setting CHROME_BIN_PATH to an invalid path here?

Copy link
Copy Markdown
Contributor

@QuanMPhm QuanMPhm Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests were designed to not require integration with 3rd-party tools. We do have an actual integration test for Chrome that installs Chromium and sets a real path.

As for why the env var is set, rather than not set at all, I wanted to check that code actual read the env var CHROME_BIN_PATH, whatever it maybe.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to perform validation in two different places. If we're using pydantic_settings to manage environment variables, then rather than using required_env_files here, just mark chrome_bin_path as required in process_report.settings.Settings. Currently it is optional:

    chrome_bin_path: str | None = None

To make it required:

    chrome_bin_path: str

This will cause pydantic to throw a validation error when instantiating Settings:

pydantic_core._pydantic_core.ValidationError: 1 validation error for Settings
chrome_bin_path
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing

Similarly, you can remove the check for KEYCLOAK_CLIENT_ID and KEYCLOAK_CLIENT_SECRET here by adding a validator to Settings:

from pydantic import model_validator
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    ...

    
    @model_validator(mode="after")
    def check_keycloak_auth(self):
        if not self.coldfront_api_filepath and not (
            self.keycloak_client_id and self.keycloak_client_secret
        ):
            raise ValueError(
                "You must either set coldfront_api_filepath or provide keycloak credentials in "
                "KEYCLOAK_CLIENT_ID and KEYCLOAK_CLIENT_SECRET"
            )

You probably want to catch the exception from pydantic to produce a more friendly error message that does not include a full Python traceback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor invoices so all env vars are fetched at the start of billing pipeline

3 participants